Erasure coding and redundant replication

ABSTRACT

Disclosed are various embodiments for employing an erasure coding storage scheme and a redundant replication storage scheme in a data storage system. Data objects that are greater than a size threshold and accessed less frequently than an access threshold are stored in an erasure coding scheme, while data objects that are sized less than a size threshold or accessed more often than an access threshold are stored in a redundant replication storage scheme.

BACKGROUND

Various methods are employed to increase the durability of data in a relational database management system, a non-relational data storage system, or other distributed data storage system or distributed database. In large scale distributed data storage systems, redundant replication, where multiple copies of a data object are stored in multiple nodes of a distributed data storage system, which can also be disparately located across multiple data centers, can be employed to increase data durability. The storage costs of employing a redundant replication scheme as the amount and number of data objects in the distributed data storage system grows can be quite high.

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.

FIGS. 1-4 are drawings of a data storage system according to various embodiments of the present disclosure.

FIGS. 5-7 are flowcharts illustrating one example of functionality implemented as portions of the data storage application executed in a computing device of FIG. 1 according to various embodiments of the present disclosure.

FIG. 8 is a schematic block diagram that provides one example illustration of a computing device employed in the data storage system of FIG. 1 according to various embodiments of the present disclosure.

DETAILED DESCRIPTION

Embodiments of the present disclosure provide a data storage system in which data objects can be stored according to various storage schemes that increase data durability. As can be appreciated, a redundant replication storage scheme involves the storage of multiple copies of a data object across various nodes to improve reliability of the data storage system. In such a scenario, in the event of the failure of one of the nodes in a data storage system, a copy of the data object can be retrieved from another node. In a data storage system housing large amounts of data, exclusive use of such a storage scheme can result in high physical storage costs, as the capacity of nodes must be such that each can house the entirety of the data objects in the data storage system.

An erasure coding storage scheme can reduce storage costs, as such a scheme involves splitting data objects into multiple shards or fragments that are each sized less than the size of a data object encoded in the erasure coding scheme, and storing a subset of the shards in each of the nodes of the data storage system. In some embodiments, a total size of the multiple shards or fragments is greater than or equal to the size of a data object that is encoded in an erasure coding scheme. As one example, each node can store one of the shards. Accordingly, as can be appreciated in an erasure coding scheme, the data object then can be reconstructed from fewer than all of these shards. However, in order to retrieve the data object from the data storage system, the CPU and I/O operations needed to reconstruct a data object in this fashion can be higher relative to retrieval of a data object stored in a redundant replication storage scheme. Therefore, embodiments of the disclosure can store various data objects in varying storage schemes according to various factors that balance storage costs as well as computational costs of retrieval of the data objects.

With reference to FIG. 1, shown is a data storage system comprising a plurality of data store nodes 101 and at least one computing device 103 according to an embodiment of the present disclosure. In one example of a data storage system according to an embodiment of the disclosure, there can be any number (N) of data store nodes 101 that house data objects that are accessible via a computing device executing a data storage application 105. It is understood that data store nodes 101 in a data storage system may be disparately located across various data centers and/or networks to improve reliability, disaster recovery capability, latency, and/or other considerations as can be appreciated. In one embodiment, the data store nodes 101 are in data communication with one or more computing devices 103 as well as each other over an appropriate network. The computing device 103 can in turn be in communication with one or more clients 109 over the network. Such a network may comprise, for example, the Internet, intranets, wide area networks (WANs), local area networks (LANs), wireless networks, or other suitable networks, etc., or any combination of two or more such networks.

The computing device 103 may comprise, for example, a server computer or any other system providing computing capability. Alternatively, a plurality of computing devices 103 may be employed that are arranged, for example, in one or more server banks or computer banks or other arrangements. For example, a plurality of computing devices 103 together may comprise a cloud computing resource, a grid computing resource, and/or any other distributed computing arrangement. Such computing devices 103 may be located in a single installation or may be distributed among many different geographical locations. For purposes of convenience, the computing device 103 is referred to herein in the singular. Even though the computing device is referred to in the singular, it is understood that a plurality of computing devices 103 may be employed in the various arrangements as described above. Additionally, the data store nodes 101 can also be implemented in a computing device as described above.

Various applications and/or other functionality may be executed in the computing device 103 according to various embodiments. The components executed on the computing device 103, for example, include a data storage application 105, and other applications, services, processes, systems, engines, or functionality not discussed in detail herein. The data storage application 105 is executed to manage access to and storage of data objects stored in a data storage system that also includes the various data store nodes 101. The data storage application 105 can receive requests from clients 109 to store, modify, and/or retrieve data objects from the data storage system. As will be described in further detail herein, these data objects can be stored across the various data store nodes 101 in various encoding schemes.

The computing device 103 can maintain a data object index 111 that can maintain information regarding data objects stored in the data storage system across the various data store nodes 101. The index 111 can include, for example, a location in the data store nodes 101 of data objects, a size, an encoding scheme of the data object as it is stored in the data storage system, and other information. In some embodiments, the index 111 can also include other information regarding data objects depending upon the implementation of a data storage system. For example, the index 111 can include a timestamp that reveals when a data object was created, accessed, modified, etc. In other words, the index 111 can include any information about data objects and/or fragments or shards of a data object stored in the data storage system that facilitates storage and retrieval of data objects in the data storage system.

The computing device 103 can also maintain a log 113 that can record a history of activity regarding data objects stored in the data storage system. In some embodiments, the log 113 can be an access log that records a history of accesses of the data objects. In other words, the data storage application 105 can record each time a data object is accessed by a client 109 in the log 113. The data storage application 105 can record other information in the log 113 as can be appreciated, such as information about when an object is created or modified, or other historical data about data objects.

Depending upon an implementation of a data storage system according to an embodiment of this disclosure, information about data objects in the data storage system can be stored in either the index 111, the log 113, or both. As one example, the data storage application 105 can store a most recent access of a data object in the index 111 in an entry associated with the data object, while the log 113 can store a record of each time a data object is accessed. Additionally, in one embodiment, the computing device 103 can maintain the index 111 in memory so that the index 111 can be quickly retrieved and/or manipulated and data objects can be quickly retrieved from the various data store nodes 101. In other words, the index 111 can be maintained in memory to improve performance of the data storage system. Alternatively, the log 113 can be stored and/or maintained in a data store, solid state storage system, hard disk drive, or other storage system, as the data storage application 105 may not need to quickly access the log 113 for performance reasons, and the amount of data stored in the log 113 may render maintaining the log 113 in memory prohibitively impractical.

However, other variations of an implementation of the computing device 103 as it pertains to the arrangement of data in an index 111 and/or log 113 should be appreciated by a person of ordinary skill in the art. As one example, in one embodiment of a data storage system the index 111 may only maintain a storage location among the data store nodes 101 of a data object, while other data regarding the object, such as an encoding scheme and timestamp, can be stored in the log 113. In other embodiments, a data storage system may store all relevant information about data objects in a log 113 and forego the use of an index 111 altogether. Other variations should be appreciated, and the implementation discussed above is but one example given for illustrative purposes only.
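As a minimal sketch of the arrangement in which the index 111 holds only the most recent access while the log 113 records every access, consider the following. The names `IndexEntry`, `Catalog`, and `record_access`, the field layout, and the use of an in-memory list as a stand-in for a log kept on slower storage are all assumptions made for illustration and are not taken from the disclosure.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class IndexEntry:
    """Hypothetical per-object entry of the index 111."""
    locations: List[str]                  # nodes holding copies or shards
    size: int                             # object size in bytes
    scheme: str                           # "replication" or "erasure_coding"
    last_access: Optional[float] = None   # most recent access only


@dataclass
class Catalog:
    """Hypothetical pairing of an in-memory index with an append-only log."""
    index: Dict[str, IndexEntry] = field(default_factory=dict)
    # Stand-in for the log 113; a real log would likely live on disk.
    log: List[Tuple[float, str, str]] = field(default_factory=list)

    def record_access(self, object_id: str, now: Optional[float] = None) -> None:
        ts = time.time() if now is None else now
        self.index[object_id].last_access = ts      # overwrite in the index
        self.log.append((ts, object_id, "access"))  # append to the log
```

In this sketch the index stays small because each access overwrites a single field, whereas the log grows with every access, which is consistent with keeping only the former in memory.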

The components executed on the data store nodes 101, for example, include a data store server 119, and other applications, services, processes, systems, engines, or functionality not discussed in detail herein. The data store server 119 can be in communication with the data storage application 105 and facilitate storage and/or retrieval of data objects stored in a data store node 101. The data store server 119 can receive requests from the data storage application 105 to store, modify, and/or retrieve data objects in a data store node 101 that is a part of a data storage system. A data store node 101 can also include a data store 121 in which data objects can be stored. As will be discussed herein, in some embodiments, a copy of a data object can be stored in the data store 121 as can fragments or shards of a data object.

The client 109 is representative of a plurality of client devices that may be in communication with the computing device 103 over a network. The client 109 may comprise, for example, a processor-based system such as a computer system. Such a computer system may be embodied in the form of a desktop computer, a laptop computer, a server computer, a cloud computing resource, a grid computing resource, or other devices or systems with like capability. The client 109 may be configured to execute various applications such as a data store client application 151 and/or other applications. The data store client application 151 may be executed in a client 109 to facilitate interaction with the data storage application 105. In one embodiment, the data store client application 151 may be configured, for example, to access and render network pages, such as web pages, or other network content served up by the computing device 103, and/or other servers for the purpose of interfacing with the data storage application 105.

In various embodiments, the data store client application 151 may comprise a thin client application, a thick client application, or another type of client application. Some embodiments may include a graphical user interface and/or a command-line interface. In some embodiments, the client 109 can be configured to interact with a data storage system provided by the computing devices 103 as well as the data store nodes 101a . . . 101N via an application programming interface (API) provided by the data storage application 105 executed in a computing device 103.

Although the data store client application 151 is described as executed in a client 109, it is understood that the client 109 may correspond to a server computer that processes business logic, generates network pages, and/or performs other tasks. Thus, although requests to store, modify, and/or retrieve a data object in the data storage system can be initiated by a user through a user interface provided by a data store client application 151 and/or the data storage application 105, such a request may also be generated automatically by business logic applications, workflow engines, content servers, application servers, and/or other applications.

The data store client application 151 may correspond to a portion of another application, such as, for example, a module, a library, etc. in various embodiments. A request to access the data storage system may be sent over a network to the data storage application 105 using hypertext transfer protocol (HTTP), simple object access protocol (SOAP), remote procedure call (RPC), remote method invocation (RMI), a proprietary protocol and/or other protocols.

Next, a general description of the operation of the various components of a data storage system according to an embodiment of the disclosure is provided. FIG. 1 illustrates an example of a data object 153 being stored in a data storage system facilitated by the computing device 103 and the data store nodes 101a . . . 101N. In the depicted example, the data object 153 is stored in the data storage system in a redundant replication storage scheme across the various data store nodes 101. Accordingly, in one example, a data object 153 can be submitted by a client 109 to the data storage application 105 for storage in the data storage system. The data storage application 105 can then facilitate storage of a data object copy 155a . . . 155N in the various data store nodes 101a . . . 101N.

As described above, such a redundant scheme can provide increased data durability, as the data store nodes 101 can be disparately located among multiple server power supplies, server cabinets, data centers, geographic locations, and the like. However, exclusive use of a redundant replication storage scheme results in the need for a total storage capacity across the data store nodes 101 that is at least a factor of N greater than the total size of the data objects stored in the data storage system.

Upon storage of the data object 153 in the data store nodes 101a . . . 101N of the data storage system, the data storage application 105 can index the location of the data object copy 155a . . . 155N in the various data store nodes 101a . . . 101N in the index 111. In one embodiment, the data storage application 105 can generate a unique identifier associated with the data object 153 that is stored in the index 111 in an entry associated with the data object 153. Accordingly, a data store server 119 associated with a data store node 101 can retrieve a data object copy 155 from the data store 121 using this unique identifier. In one example, the data store server 119 can maintain a location in the data store 121 associated with a unique identifier associated with the data object, and the data store server 119 can retrieve a data object copy 155 from its location in the data store 121 when requested by the data storage application 105. Additionally, the data storage application 105 can record any requests to access the data object 153 in the log 113.

Reference is now made to FIG. 2, which illustrates how the data object 153 can be retrieved from or accessed in the data storage system. Assuming the data store node 101a has failed in some way, because the data object 153 was stored in a redundant replication storage scheme among the data store nodes 101a . . . 101N, the data storage application 105 can respond to a request from a client 109 to retrieve the data object 153 by retrieving a data object copy 155 from any of the other data store nodes 101b . . . 101N. In the depicted example, the data storage application 105 can retrieve a data object copy 155b from the data store node 101b.
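The failover read described for FIG. 2 might be sketched as follows. This is an illustrative model only: each data store node is represented as a dictionary keyed by object identifier, a failed node is represented as `None`, and the function name `read_replica` is hypothetical.

```python
from typing import Dict, Optional, Sequence


def read_replica(
    nodes: Dict[str, Optional[Dict[str, bytes]]],
    replica_nodes: Sequence[str],
    object_id: str,
) -> bytes:
    """Return the first available copy of an object, skipping failed nodes."""
    for node_id in replica_nodes:
        store = nodes.get(node_id)
        if store is None:
            # Node is unknown or has failed; try the next replica.
            continue
        if object_id in store:
            return store[object_id]
    raise LookupError(f"no available copy of {object_id!r}")
```

For example, with node `101a` failed, a read of object `153` falls through to node `101b`:

```python
nodes = {"101a": None, "101b": {"153": b"payload"}, "101c": {"153": b"payload"}}
copy = read_replica(nodes, ["101a", "101b", "101c"], "153")
```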

Reference is now made to FIG. 3, which depicts an example of storage of a data object 153 using an erasure coding storage scheme. In the depicted example, the data storage application 105 can receive a data object 153 from a client 109 for storage in the data storage system. Accordingly, to implement an erasure coding algorithm on the data object 153, the data storage application 105 can split the data object 153 into a first plurality of shards or fragments. The data storage application 105 can then generate additional shards or fragments from the first plurality of shards or fragments as a part of an erasure coding algorithm. The data storage application 105 can then store a subset of these data object shards 358a . . . 358N, which are sized less than the size of the original data object 153, in the data store nodes 101. In one example, the data storage application 105 can store one shard in each of the data store nodes 101a . . . 101N.
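The disclosure does not name a particular erasure coding algorithm. As a toy illustration of the split-then-generate pattern, the sketch below uses a single XOR parity shard, so that the total number of shards is one more than the number of data shards and only one lost shard can be tolerated. Production systems typically use stronger codes such as Reed-Solomon. The function names `encode` and `decode` and the zero-padding convention are assumptions of this sketch.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal shards (zero-padded) plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def decode(shards: List[Optional[bytes]], size: int) -> bytes:
    """Rebuild the original data when at most one shard is missing (None)."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code tolerates only one lost shard")
    rebuilt = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of all surviving shards equals the missing shard.
        rebuilt[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity shard and strip the padding.
    return b"".join(rebuilt[:-1])[:size]
```

The original object size is passed to `decode` so that padding can be removed; in a real system that size would presumably come from the index entry for the object.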

Stated another way, in one example, the data storage application 105 can split the data object 153 into k shards, which are sized, to the extent possible, proportionally to the size of the data object 153. In other words, the size of each of the k shards can be expressed as approximately 1/k of the size of the data object 153. Accordingly, from these k shards, the data storage application 105 can generate an additional n-k shards of a size that is similar to the first k shards, resulting in a total of n data object shards 358a . . . 358N associated with the data object 153. Accordingly, one of the n data object shards 358 can be stored in each of the data store nodes 101a . . . 101N. Therefore, the amount of data storage needed in the data storage system to store the n data object shards 358 can be expressed as approximately (n/k)*S, where S is the size of the data object 153. Additionally, by employing an erasure coding algorithm, the data storage application 105 can recover the original data object using any k of the n shards, meaning the data object 153 is durably stored until more than n-k data store nodes 101 experience a failure.

In one example of an erasure coding scheme, n is twelve and k is six, which means that in order to store the data object 153 among the data store nodes 101, the total storage space required in the data storage system is twice the original size of the data object 153. Additionally, the data is durably stored in the data storage system until seven of the data store nodes 101 experience failure. In contrast, to store the same data object 153 in a redundant replication storage scheme across only three data store nodes 101, the total storage space required in the data storage system is three times the original size of the data object 153.
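The arithmetic behind this comparison can be checked with a few lines of code. The function names are illustrative, and the model assumes one shard or one copy per node with equal-sized shards.

```python
def erasure_storage_factor(n: int, k: int) -> float:
    """Total stored bytes as a multiple of the object size S, i.e. n/k."""
    return n / k


def erasure_failures_tolerated(n: int, k: int) -> int:
    """Any k of n shards suffice, so up to n - k node failures are survivable."""
    return n - k


def replication_storage_factor(copies: int) -> float:
    """Each full copy costs one object size."""
    return float(copies)


def replication_failures_tolerated(copies: int) -> int:
    """One surviving copy suffices."""
    return copies - 1
```

With n = 12 and k = 6, the storage factor is 2 and six node failures are survivable, so data is lost only at the seventh failure. Three-way replication costs a factor of 3 and survives two failures.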

The data storage application 105 can index a location of the data object shards 358 in the data store nodes 101a . . . 101N in the index 111 so that the data object 153 can be reconstructed and retrieved on behalf of a requesting client 109 as well as log any requests to access the data object 153 in the log 113.

Reference is now made to FIG. 4, which illustrates retrieval of a data object 153 from the data storage application 105. Assuming a failure of one or more data store nodes 101, upon receiving a request from a client 109 to retrieve a data object 153, the data storage application 105 can reconstruct the data object 153 from a subset of the data object shards 358 stored in the remaining data store nodes 101. As can be appreciated, reconstructing a data object 153 by employing an erasure coding algorithm can be computationally intensive relative to a redundant replication storage scheme. Additionally, reconstructing a data object 153 can also require more I/O operations, as a plurality of shards must be retrieved from the data store nodes 101 in a data storage system in order to reconstruct the data object 153. Therefore, in some embodiments, although employing an erasure coding scheme can reduce the overall storage requirements to achieve a desired data durability, retrieving a data object 153 stored in an erasure coding storage scheme can result in higher relative latency due to the need to reconstruct the data object 153 from a plurality of data object shards 358.

Accordingly, embodiments of the present disclosure can store data objects using a mix of redundant replication and erasure coding to achieve a desired balance between these storage and performance considerations. In some data storage systems, a large percentage of the overall storage capacity of the data storage system is consumed by relatively few large objects. Additionally, in some data storage systems, a large percentage of the most frequently accessed data objects are relatively small in size. Accordingly, one way to achieve a balance between these considerations is to employ an erasure coding storage scheme for those data objects that are relatively large and are rarely accessed. In this way, the total amount of storage space within the data storage system that is devoted to storage of these data objects can be reduced, and the performance degradation of the data storage system due to the need to reconstruct the data object using an erasure coding algorithm when the data object is retrieved is acceptable because the data object is rarely accessed.

Additionally, it can be determined that the performance penalty of accessing a small data object stored in an erasure coding storage scheme that is also rarely accessed may be undesirable, as storing a small object in a redundant replication scheme consumes relatively little storage capacity, even though the data object is rarely accessed. Because, in many data storage systems, there can be a large number of small data objects stored therein, storing small data objects in an erasure coding scheme can result in an unacceptably large index 111, as each of the data object shards associated with the small data object is indexed in the index 111 so that the data storage application 105 can retrieve a shard to reconstruct the data object.

As one illustrative non-limiting example, in some data storage systems, data objects that are sized less than 128 kilobytes (KB) can represent 90% of the total number of data objects stored in the data storage system, whereas these same objects can represent less than 10% of the total storage capacity consumed in the data storage system. Additionally, as another illustrative non-limiting example, these objects that are sized less than 128 KB can represent more than 90% of the data objects that are accessed by clients 109. In other words, these objects can represent more than 90% of “traffic.”

Therefore, a data object size distribution of the data objects stored in the data storage system can be generated that can be analyzed to determine a size threshold that represents a relatively small number of data objects that also represents a relatively large amount of the total storage capacity consumed in the data storage system. Additionally, an access pattern distribution can be generated to determine an access threshold that can be related to a size of data objects in the data storage system that are relatively rarely accessed. Accordingly, in one embodiment of the present disclosure, the data storage application 105 can store those objects that are greater than a particular size threshold in an erasure coding storage scheme. Additionally, in another embodiment, the data storage application 105 can store those objects that are rarely accessed in an erasure coding scheme. For example, the data storage application 105 can determine those objects that are rarely accessed over a particular period of time (e.g., the previous twenty-four hours, the previous seven days, the previous thirty days, etc.). As another example, the data storage application 105 can store those objects that are sized greater than or equal to the size threshold and accessed less often during a period of time than the access threshold in an erasure coding scheme.

In some embodiments, the data storage application 105 can continually adapt these thresholds to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding storage scheme. For example, the data storage application 105 can periodically generate an object size distribution and identify a size threshold that represents the largest ten percent of data objects in the data storage system. Continuing this non-limiting example, the data storage application 105 can periodically generate an access pattern distribution and identify an access threshold that represents the ten percent of data objects that are accessed least frequently.
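One simple way to derive such percentile-style thresholds from a list of object sizes and a list of per-object access counts is sketched below. The function names, the integer-percent parameters, and the choice to return a boundary value that is itself part of the selected group are assumptions of this sketch. Both functions expect a non-empty input.

```python
from typing import Sequence


def size_threshold(sizes: Sequence[int], top_percent: int) -> int:
    """Smallest size among the largest `top_percent` percent of objects."""
    ordered = sorted(sizes, reverse=True)
    count = max(1, len(ordered) * top_percent // 100)
    return ordered[count - 1]


def access_threshold(access_counts: Sequence[int], bottom_percent: int) -> int:
    """Largest access count among the least-accessed `bottom_percent` percent."""
    ordered = sorted(access_counts)
    count = max(1, len(ordered) * bottom_percent // 100)
    return ordered[count - 1]
```

Because each returned value belongs to the selected group, a caller using these particular helpers would compare inclusively (size greater than or equal to the size threshold, access count less than or equal to the access threshold) to capture the full ten percent.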

Upon identifying these thresholds, the data storage application 105 can convert a storage scheme of data objects stored in the data storage system in a redundant replication scheme that are greater than the size threshold and/or accessed less often than the access threshold into an erasure coding storage scheme. Additionally, generating an access pattern distribution can also involve identifying those objects that are most frequently accessed in the data storage system. Accordingly, upon identifying these most frequently accessed data objects in the data storage system, the data storage application 105 can also convert a storage scheme of these data objects to a redundant replication storage scheme if they are presently stored in an erasure coding storage scheme. The data storage application 105 can perform this conversion even if the data object is sized greater than the size threshold to reduce the latency associated with retrieval of such a data object. In other words, the data storage application 105 can identify those objects that are “hot,” meaning they are frequently accessed, and ensure that they are stored in a redundant replication storage scheme.

In one embodiment, the data storage application 105 can generate an object size distribution by scanning the index 111, which can include a data object size entry associated with at least one data object in the data storage system. In another embodiment, the data storage application 105 can scan log entries in the log 113 that may include size information associated with the data objects in the data storage system. In another embodiment, the data storage application 105 can generate an access pattern distribution by scanning an access log associated with the log 113.

In some embodiments, the data storage application 105 can generate an object size distribution and/or an access pattern distribution by sampling the index 111 and/or log 113, as examining each entry in the index 111 and/or log 113 may be computationally and/or resource intensive. In the case of generating an access pattern distribution by sampling an access log, for example, such an access pattern distribution may not identify those data objects that are less frequently accessed, as these objects may be associated with few or no entries in such an access log. However, sampling an index 111 and/or log 113 in order to generate an access pattern distribution is likely to identify data objects that are frequently accessed, and the data storage application 105 can identify a data object size associated with these data objects. The data storage application 105 can then ensure that these “hot” data objects are stored in a redundant replication storage scheme, as frequent retrieval of “hot” objects that are large and stored in an erasure coding storage scheme can result in a significant performance penalty because of the computational and I/O resources that may be needed to reconstruct an erasure coded data object.
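A minimal sketch of sampling an access log to find "hot" objects follows. It assumes the log can be read as a sequence of object identifiers, one per access, and uses independent per-entry sampling. The function name `sample_hot_objects` and its parameters are hypothetical.

```python
import random
from collections import Counter
from typing import Iterable, List, Optional


def sample_hot_objects(
    access_log: Iterable[str],
    sample_rate: float,
    top_n: int,
    seed: Optional[int] = None,
) -> List[str]:
    """Estimate the most frequently accessed objects from a sampled log."""
    rng = random.Random(seed)
    # Keep each log entry with probability `sample_rate`.
    sampled = [oid for oid in access_log if rng.random() < sample_rate]
    return [oid for oid, _ in Counter(sampled).most_common(top_n)]
```

As the passage notes, rarely accessed objects are likely to be absent from the sample altogether, so this approach is better suited to identifying frequently accessed objects than to establishing which objects are cold.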

The various parameters regarding the specific erasure coding storage scheme as well as the redundant replication storage scheme can vary depending on the implementation of an embodiment of the disclosure. Additionally, a data storage system according to the disclosure can employ a varying number of data store nodes 101 depending on cost, performance, and other factors. As one non-limiting example, a data storage system according to the disclosure can mirror a data object copy among three data store nodes when a redundant replication storage scheme is employed for a particular data object. The data storage system, in this example, can also employ an erasure coding scheme where n=6 and k=3, meaning there can be six data object shards stored among six data store nodes. Other variations should be appreciated by a person of ordinary skill in the art.

FIGS. 5-7 depict flowcharts that provide non-limiting examples of the operation of a portion of the data storage application 105 according to various embodiments. It is understood that the flowcharts of FIGS. 5-7 provide merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the data storage application 105 as described herein. As an alternative, the flowcharts of FIGS. 5-7 may be viewed as depicting examples of steps of methods implemented in the computing device 103 (FIG. 1) according to one or more embodiments.

FIG. 5 depicts one way in which the data storage application 105 associated with a data storage system can employ a mix of redundant replication as well as erasure coding storage schemes as described herein. In the depicted embodiment, in box 501 the data storage application 105 can receive a data object request, which can include a request to create, access and/or modify a data object in the data storage system. In box 503, the data storage application 105 can determine whether the data object is sized greater than a size threshold. If the data object size is not greater than the size threshold, the data storage application 105 can determine whether the data object is stored in a redundant replication storage scheme in box 505. If the data object is not stored in the data storage system in a redundant replication storage scheme, the data storage application 105 can store the object in a redundant replication scheme in box 507. If the data object size is greater than the size threshold, the data storage application 105 can determine whether the data object is stored in an erasure coding storage scheme in box 509. If the data object is not stored in an erasure coding storage scheme, the data object can be stored in the erasure coding storage scheme in box 511.

FIG. 6 depicts an alternative way in which the data storage application 105 associated with a data storage system can employ a mix of redundant replication as well as erasure coding storage schemes as described herein. In the depicted embodiment, in box 601 the data storage application 105 can receive a data object request, which can include a request to create, access and/or modify a data object in the data storage system. In box 603, the data storage application 105 can determine whether the data object is sized greater than a size threshold. If the data object size is not greater than the size threshold, the data storage application 105 can determine whether the data object is stored in a redundant replication storage scheme in box 605. If the data object is not stored in the data storage system in a redundant replication storage scheme, the data storage application 105 can store the object in a redundant replication scheme in box 607.

If the data object size is greater than the size threshold, the data storage application 105 can determine whether the data object is accessed less often than an access threshold in box 609. If the data object is accessed more often than the access threshold, then the data storage application 105 can proceed to boxes 605 and 607 as described above. If the data object is accessed less often than the access threshold, the data storage application 105 can determine whether the data object is stored in an erasure coding replication scheme in box 611. If the data object is not stored in an erasure coding replication scheme, the data object can be stored in the erasure coding replication scheme in box 613.
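The combined size and access test of FIG. 6 can be sketched as a single function. This is an illustrative sketch only: the function name, the string labels, and the use of an access count over some period as the access measure are assumptions. The text does not say how an object accessed exactly as often as the threshold is handled, so this sketch assumes it stays in redundant replication.

```python
def choose_scheme(object_size, access_count, size_threshold, access_threshold):
    """Pick a storage scheme following boxes 603 and 609 of FIG. 6.

    Only objects that are both larger than the size threshold and accessed
    less often than the access threshold are erasure coded; all other
    objects are kept in redundant replication.
    """
    is_large = object_size > size_threshold          # box 603
    is_cold = access_count < access_threshold        # box 609
    if is_large and is_cold:
        return "erasure_coding"                      # boxes 611/613
    return "redundant_replication"                   # boxes 605/607
```

For example, with a size threshold of 1000 and an access threshold of 5, a 4000-unit object read twice in the period would be erasure coded, while the same object read fifty times would remain replicated.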

Accordingly, FIGS. 5-6 represent methods in which the data storage application 105 can, on an object-by-object basis, assess whether a particular data object that is the subject of a request to retrieve, create and/or modify the object is stored in the data storage system using the appropriate storage scheme. In contrast, FIG. 7 represents a method in which the data storage application 105 can analyze the data objects in a data storage system on a periodic basis and calculate thresholds to determine whether data objects should be stored in a redundant replication storage scheme or an erasure coding storage scheme.

In FIG. 7, in box 701, the data storage application 105 can generate an object size distribution. As described above, an object size distribution can be generated by scanning and/or sampling an index 111 and/or log 113 to determine a distribution of data objects in the data storage system according to their size. A size threshold can be identified based at least upon this distribution. For example, the size threshold can be the data object size above which ten percent of the data objects in the data storage system fall.
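One way to derive such a size threshold from a list of observed object sizes is sketched below. The function name and the `fraction_above` parameter are hypothetical, and the sizes are assumed to have already been gathered by scanning or sampling the index 111 and/or log 113.

```python
def size_threshold_from_sizes(sizes, fraction_above=0.10):
    """Return a size T such that about `fraction_above` of objects exceed T.

    With ties at the threshold value, fewer objects than requested may be
    strictly larger than T.
    """
    if not sizes:
        raise ValueError("at least one object size is required")
    ordered = sorted(sizes)
    n = len(ordered)
    # Number of objects that should fall above the threshold, keeping at
    # least one object at or below it so the index stays valid.
    count_above = min(int(n * fraction_above), n - 1)
    return ordered[n - count_above - 1]
```

With one hundred objects sized 1 through 100 and the default ten percent, the sketch returns 90, leaving the ten largest objects above the threshold.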

In box 703, the data storage application 105 can generate an access pattern distribution. As described above, an access threshold can be identified from this distribution so that data objects accessed less often than the access threshold can be distinguished from those accessed more often. In box 705, the data storage application 105 can identify objects sized greater than the size threshold and in box 707, the data storage application 105 can identify from these data objects those that are accessed less often than the access threshold. In box 709, the data objects that are greater than the size threshold and accessed less often than the access threshold can be stored in an erasure coding scheme.
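The periodic selection of boxes 705 through 709 can be sketched as a two-stage filter. The `ObjectRecord` type and its fields are hypothetical stand-ins for whatever the index 111 or log 113 actually records, and the thresholds are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class ObjectRecord:
    # Hypothetical per-object metadata; not a structure named in the text.
    key: str
    size: int
    access_count: int


def select_for_erasure_coding(records, size_threshold, access_threshold):
    """Return keys of objects that should be moved to erasure coding."""
    # Box 705: keep only objects sized greater than the size threshold.
    large = [r for r in records if r.size > size_threshold]
    # Box 707: of those, keep objects accessed less often than the threshold.
    cold = [r for r in large if r.access_count < access_threshold]
    # Box 709 would store each selected object in the erasure coding scheme.
    return [r.key for r in cold]
```

Running the filter over the whole index on a schedule, rather than per request, is what distinguishes this flow from the per-object checks of FIGS. 5-6.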

With reference to FIG. 8, shown is a schematic block diagram of the computing device 103 according to an embodiment of the present disclosure. The computing device 103 includes at least one processor circuit, for example, having a processor 903 and a memory 906, both of which are coupled to a local interface 909. To this end, the computing device 103 may comprise, for example, at least one server computer or like device. The local interface 909 may comprise, for example, a data bus with an accompanying address/control bus or other bus structure as can be appreciated.

Stored in the memory 906 are both data and several components that are executable by the processor 903. In particular, stored in the memory 906 and executable by the processor 903 are the data storage application 105, and potentially other applications. In addition, an operating system may be stored in the memory 906 and executable by the processor 903.

It is understood that there may be other applications that are stored in the memory 906 and are executable by the processor 903 as can be appreciated. Where any component discussed herein is implemented in the form of software, any one of a number of programming languages may be employed such as, for example, C, C++, C#, Objective C, Java, JavaScript, Perl, PHP, Visual Basic, Python, Ruby, Delphi, Flash, or other programming languages.

A number of software components are stored in the memory 906 and are executable by the processor 903. In this respect, the term “executable” means a program file that is in a form that can ultimately be run by the processor 903. Examples of executable programs may be, for example, a compiled program that can be translated into machine code in a format that can be loaded into a random access portion of the memory 906 and run by the processor 903, source code that may be expressed in proper format such as object code that is capable of being loaded into a random access portion of the memory 906 and executed by the processor 903, or source code that may be interpreted by another executable program to generate instructions in a random access portion of the memory 906 to be executed by the processor 903, etc. An executable program may be stored in any portion or component of the memory 906 including, for example, random access memory (RAM), read-only memory (ROM), hard drive, solid-state drive, USB flash drive, memory card, optical disc such as compact disc (CD) or digital versatile disc (DVD), floppy disk, magnetic tape, or other memory components.

The memory 906 is defined herein as including both volatile and nonvolatile memory and data storage components. Volatile components are those that do not retain data values upon loss of power. Nonvolatile components are those that retain data upon a loss of power. Thus, the memory 906 may comprise, for example, random access memory (RAM), read-only memory (ROM), hard disk drives, solid-state drives, USB flash drives, memory cards accessed via a memory card reader, floppy disks accessed via an associated floppy disk drive, optical discs accessed via an optical disc drive, magnetic tapes accessed via an appropriate tape drive, and/or other memory components, or a combination of any two or more of these memory components. In addition, the RAM may comprise, for example, static random access memory (SRAM), dynamic random access memory (DRAM), or magnetic random access memory (MRAM) and other such devices. The ROM may comprise, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other like memory device.

Also, the processor 903 may represent multiple processors 903 and the memory 906 may represent multiple memories 906 that operate in parallel processing circuits, respectively. In such a case, the local interface 909 may be an appropriate network that facilitates communication between any two of the multiple processors 903, between any processor 903 and any of the memories 906, or between any two of the memories 906, etc. The local interface 909 may comprise additional systems designed to coordinate this communication, including, for example, performing load balancing. The processor 903 may be of electrical or of some other available construction.

Although the data storage application 105 and other various systems described herein may be embodied in software or code executed by general purpose hardware as discussed above, as an alternative the same may also be embodied in dedicated hardware or a combination of software/general purpose hardware and dedicated hardware. If embodied in dedicated hardware, each can be implemented as a circuit or state machine that employs any one of or a combination of a number of technologies. These technologies may include, but are not limited to, discrete logic circuits having logic gates for implementing various logic functions upon an application of one or more data signals, application specific integrated circuits having appropriate logic gates, or other components, etc. Such technologies are generally well known by those skilled in the art and, consequently, are not described in detail herein.

The flowcharts of FIGS. 5-7 show the functionality and operation of an implementation of portions of the data storage application 105. If embodied in software, each block may represent a module, segment, or portion of code that comprises program instructions to implement the specified logical function(s). The program instructions may be embodied in the form of source code that comprises human-readable statements written in a programming language or machine code that comprises numerical instructions recognizable by a suitable execution system such as a processor 903 in a computer system or other system. The machine code may be converted from the source code, etc. If embodied in hardware, each block may represent a circuit or a number of interconnected circuits to implement the specified logical function(s).

Although FIGS. 5-7 show a specific order of execution, it is understood that the order of execution may differ from that which is depicted. For example, the order of execution of two or more blocks may be scrambled relative to the order shown. Also, two or more blocks shown in succession in FIGS. 5-7 may be executed concurrently or with partial concurrence. Further, in some embodiments, one or more of the blocks shown in FIGS. 5-7 may be skipped or omitted. In addition, any number of counters, state variables, warning semaphores, or messages might be added to the logical flow described herein, for purposes of enhanced utility, accounting, performance measurement, or providing troubleshooting aids, etc. It is understood that all such variations are within the scope of the present disclosure.

Also, any logic or application described herein, such as the data storage application 105, that comprises software or code can be embodied in any non-transitory computer-readable medium for use by or in connection with an instruction execution system such as, for example, a processor 903 in a computer system or other system. In this sense, the logic may comprise, for example, statements including instructions and declarations that can be fetched from the computer-readable medium and executed by the instruction execution system. In the context of the present disclosure, a “computer-readable medium” can be any medium that can contain, store, or maintain the logic or application described herein for use by or in connection with the instruction execution system. The computer-readable medium can comprise any one of many physical media such as, for example, magnetic, optical, or semiconductor media. More specific examples of a suitable computer-readable medium would include, but are not limited to, magnetic tapes, magnetic floppy diskettes, magnetic hard drives, memory cards, solid-state drives, USB flash drives, or optical discs. Also, the computer-readable medium may be a random access memory (RAM) including, for example, static random access memory (SRAM) and dynamic random access memory (DRAM), or magnetic random access memory (MRAM). In addition, the computer-readable medium may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other type of memory device.

It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

Therefore, the following is claimed:
 1. A non-transitory computer-readable medium embodying a program executable in at least one computing device, wherein the program, when executed, causes the at least one computing device to at least: generate an object size distribution of a plurality of data objects stored in a data storage system, the data storage system comprising at least one data store; periodically determine a size threshold to maintain a balance between data objects stored in a first data replication scheme and a second data replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; generate an access pattern distribution of the plurality of data objects; periodically determine an access frequency threshold to maintain the balance between the data objects stored in the first data replication scheme and the second data replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the access frequency threshold in the access pattern distribution; identify a first at least one object stored in the data storage system that is greater than the size threshold, the first at least one object stored in the first data replication scheme, the first data replication scheme comprising a redundant replication scheme wherein a copy of the first at least one object is stored in a plurality of data stores in the data storage system; identify whether the first at least one object is accessed less often than the access frequency threshold; store the first at least one object in the second data replication scheme in the data storage system when the first at least one object exceeds the size threshold and the first at least one object is accessed less often than the access frequency threshold over a period of time, the second data replication scheme comprising an erasure coding scheme, wherein the first at least one data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the first at least one object and stored in a respective plurality of data stores in the data storage system; identify a second at least one object stored in the data storage system that is less than the size threshold, the second at least one object stored in the second data replication scheme; identify whether the second at least one object is accessed more often than the access frequency threshold; and store the second at least one object in the first data replication scheme in the data storage system when the second at least one object is either less than the size threshold or the second at least one object is accessed more often than the access frequency threshold over a period of time.
 2. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, wherein the data storage application, when executed, causes the at least one computing device to at least: generate an object size distribution of a plurality of data objects stored in a data storage system, the data storage system comprising at least one data store; periodically determine a size threshold to maintain a balance between data objects stored in a first data replication scheme and a second data replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; generate an access pattern distribution of the plurality of data objects; periodically determine an access frequency threshold to maintain the balance between the data objects stored in the first data replication scheme and the second data replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the access frequency threshold in the access pattern distribution; identify at least one data object stored in the data storage system that is greater than the size threshold, the at least one data object stored in the first data replication scheme, the first data replication scheme comprising a redundant replication scheme wherein a copy of the at least one data object is stored in a plurality of data stores in the data storage system; identify whether the at least one data object is accessed less often than the access frequency threshold; and store the at least one data object in the second replication scheme in the data storage system when the at least one data object exceeds the size threshold and the at least one data object is accessed less often than the access frequency threshold over a period of time, the second replication scheme comprising an erasure coding scheme, wherein the at least one data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the at least one data object and stored in a respective plurality of data stores in the data storage system.
 3. The system of claim 2, wherein the plurality of shards have a total size greater than or equal to the at least one data object.
 4. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least store one of the plurality of shards in at least a subset of the at least one data store.
 5. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least: identify a location of a subset of the shards in an index accessible to the data storage application; retrieve the subset of the shards from the at least one data store; and reconstruct the at least one data object from the subset of the shards.
 6. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least: scan an access log of the data storage system over a specified period of time; and identify at least one data object accessed within the specified period of time.
 7. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least: sample an access log of the data storage system over a specified period of time; and identify at least one data object accessed within the specified period of time.
 8. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least: scan an index of a plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the objects and a most recent access of at least one of the objects; and identify at least one data object accessed within a specified period of time.
 9. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least: receive a request to retrieve a data object from the data storage system; determine whether a size of the data object is greater than the size threshold; store the data object according to the second replication scheme when the size is greater than the size threshold; and store the data object according to the first data replication scheme when the size is less than the size threshold.
 10. The system of claim 2, wherein the data storage application, when executed, further causes the at least one computing device to at least: receive a request to retrieve a data object from the data storage system; determine whether the data object has been accessed during the period of time more often than the access threshold frequency; store the data object according to the second replication scheme when the data object has been accessed less often than the access threshold frequency; and store the data object according to the first data replication scheme when the data object has been accessed during the period of time more often than the access threshold frequency.
 11. A method, comprising the steps of: receiving, in at least one computing device, a request to retrieve a data object from a data storage system comprising at least one data store; logging, in the at least one computing device, the request in an access log accessible to the at least one computing device; periodically determining, in the at least one computing device, a size threshold to maintain a balance between data objects stored in a first replication scheme and a second replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in an object size distribution for the data storage system; determining, in the at least one computing device, that a size of the data object exceeds the size threshold; determining, in the at least one computing device, that the data object is stored in the first replication scheme in the data storage system, the first replication scheme comprising a redundant replication scheme wherein a copy of the data object is stored in a plurality of data stores in the data storage system; encoding, in the at least one computing device, the data object in the second replication scheme in response to determining that the size exceeds the size threshold, the second replication scheme comprising an erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the data object and stored in a respective plurality of data stores in the data storage system; and storing, in the data storage system, the data object in the second replication scheme.
 12. The method of claim 11, further comprising the steps of: determining, in the at least one computing device, whether the size of the data object is less than the size threshold; determining, in the at least one computing device, whether the data object is stored in the second replication scheme in the data storage system; encoding, in the at least one computing device, the data object in the first replication scheme when the size is less than the size threshold; and storing, in the data storage system, the data object in the first replication scheme.
 13. The method of claim 11, further comprising the steps of: determining, in the at least one computing device, whether the data object has been accessed less often than an access threshold over a period of time; and storing, in the data storage system, the data object in the second replication scheme if the data object has been accessed less often than the access threshold and the size exceeds the size threshold.
 14. The method of claim 11, further comprising: dividing the data object into M fragments; generating N fragments from the M fragments; and storing the N fragments and the M fragments in the at least one data store.
 15. The method of claim 14, wherein the step of encoding the data object in the second replication scheme further comprises generating an index describing a location of the N fragments and the M fragments corresponding to the data object.
 16. The method of claim 14, further comprising the step of reconstructing, in the at least one computing device, the data object using an erasure coding algorithm, the data object reconstructed using a first number of the N fragments and the M fragments, wherein the first number is at least equal to M.
 17. The method of claim 11, further comprising generating, in the at least one computing device, N fragments from the data object, a total storage size of the N fragments being greater than or equal to a size of the data object.
 18. The method of claim 17, wherein the step of encoding the data object in the second replication scheme further comprises storing each one of the N fragments in a different one of the respective plurality of data stores in the data storage system.
 19. The method of claim 17, further comprising the step of reconstructing, in the at least one computing device, the data object from a subset of the N fragments.
 20. The method of claim 19, wherein the step of reconstructing the data object from the subset of the N fragments further comprises the step of retrieving a fragment from each of a subset of the respective plurality of data stores in the data storage system.
 21. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, the data storage application configured to cause the at least one computing device to at least: receive a request to store a data object in a data storage system, the data storage system comprising a plurality of data stores; generate an object size distribution for the data storage system; periodically determine a size threshold to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; determine whether an object size of the data object meets the size threshold; store the data object in the data storage system using the redundant replication scheme when the object size fails to meet the size threshold, wherein the redundant replication scheme comprises a first scheme in which a copy of the data object is stored in a first at least two of the plurality of data stores; and store the data object using the erasure coding scheme in the data storage system when the object size meets the size threshold, the erasure coding scheme comprising a second scheme in which the data object is divided into a plurality of shards, individual ones of the plurality of shards having a size less than the object size and stored in a second at least two of the plurality of data stores.
 22. The system of claim 21, wherein the data storage application is further configured to cause the at least one computing device to at least: determine whether an access frequency of the data object meets an access frequency threshold; store the data object in the data storage system using the redundant replication scheme when the access frequency meets the access frequency threshold; and store the data object in the data storage system using the erasure coding scheme when the access frequency does not meet the access frequency threshold and the object size meets the size threshold.
 23. The system of claim 21, wherein the plurality of shards have a total combined size that is greater than or equal to the object size.
 24. The system of claim 21, wherein the data storage application is further configured to cause the at least one computing device to at least store one of the plurality of shards in a respective one of the second at least two of the plurality of data stores.
 25. The system of claim 21, wherein the data storage application is further configured to cause the at least one computing device to at least: identify a location of a subset of the plurality of shards in an index accessible to the data storage application; retrieve the subset of the plurality of shards; and reconstruct the data object from the subset of the shards.
 26. The system of claim 21, wherein the data storage application is further configured to cause the at least one computing device to at least: generate an access pattern distribution for the data storage system; identify another data object stored in the data storage system having another object size that meets the size threshold, the other data object stored using the redundant replication scheme; identify whether an access frequency associated with the other data object fails to meet an access frequency threshold; and store the other data object using the erasure coding scheme responsive to identifying that the other object size meets the size threshold and the access frequency does not meet the access frequency threshold.
 27. The system of claim 26, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an access log of the data storage system over a specified period of time; and identifying at least one entry corresponding to the other data object within the access log over the specified period of time.
 28. The system of claim 27, wherein the access log is scanned by sampling the access log over the specified period of time.
 29. The system of claim 26, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an index of the plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the plurality of data objects and a most recent access of at least one of the data objects; and identifying at least one entry in the index corresponding to the other data object within a specified period of time.
 30. The system of claim 21, wherein the data storage application is further configured to cause the at least one computing device to at least: receive a request to retrieve another data object from the data storage system; determine whether another size of the other data object meets the size threshold; store the other data object using the erasure coding scheme when the other size meets the size threshold; and store the data object using the redundant replication scheme when the other size does not meet the size threshold.
 31. The system of claim 21, wherein the data storage application is further configured to cause the at least one computing device to at least: receive a request to retrieve another data object from the data storage system; determine whether an access frequency associated with the other data object meets an access frequency threshold; store the other data object using the erasure coding scheme when the access frequency does not meet the access frequency threshold; and store the other data object according to the redundant replication scheme when the access frequency meets the access frequency threshold.
 32. A method, comprising: obtaining, by at least one computing device, a request to store a data object in a data storage system comprising a plurality of data stores; generating, by the at least one computing device, an object size distribution for the data storage system; periodically determining, by the at least one computing device, a size threshold to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; determining, by the at least one computing device, that an object size of the data object fails to meet the size threshold; storing, by the at least one computing device, the data object in the data storage system using the redundant replication scheme instead of the erasure coding scheme responsive to determining that the object size fails to meet the size threshold, wherein the redundant replication scheme comprises a scheme in which a copy of the data object is stored in a first at least two of the plurality of data stores; and storing, by the at least one computing device, another data object in the data storage system using the erasure coding scheme, wherein the other data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the other data object and stored in a respective plurality of data stores in the data storage system.
33. The method of claim 32, further comprising: determining, by the at least one computing device, that an access frequency associated with the data object meets an access frequency threshold over a period of time; determining, by the at least one computing device, that the data object is stored in the erasure coding scheme; and in response to determining that the access frequency associated with the data object meets the access frequency threshold, the data object is stored in the erasure coding scheme, and the object size meets the size threshold: encoding, by the at least one computing device, the data object using the redundant replication scheme; and storing, by the at least one computing device, the data object using the redundant replication scheme in the data storage system.
34. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, the data storage application configured to cause the at least one computing device to at least: generate an access pattern distribution of a plurality of data objects stored in a data storage system, the data storage system comprising at least one data store; periodically determine an access frequency threshold to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system based at least in part on a number of data objects distributed above or below the access frequency threshold in the access pattern distribution; identify a data object stored in the data storage system having an access frequency that fails to meet the access frequency threshold, the data object stored in the data storage system using the redundant replication scheme wherein a copy of the data object is stored in a plurality of data stores in the data storage system; and store the data object in the data storage system in the erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size of less than an object size of the data object, each of the plurality of shards being stored in a respective plurality of data stores in the data storage system.
35. The system of claim 34, wherein the data storage application is further configured to cause the at least one computing device to at least: identify another data object from the access pattern distribution that is stored in the data storage system having another access frequency that meets the access frequency threshold, the other data object stored in the data storage system using the erasure coding scheme; and store the other data object in the data storage system using the redundant replication scheme.
36. The system of claim 34, wherein the plurality of shards have a total size greater than or equal to the object size of the data object.
37. The system of claim 34, wherein the data storage application is further configured to cause the at least one computing device to at least store one of the plurality of shards in at least a subset of the at least one data store.
38. The system of claim 34, wherein the data storage application is further configured to cause the at least one computing device to at least: identify a location of a subset of the shards in an index accessible to the data storage application; retrieve the subset of the shards from the at least one data store; and reconstruct the data object from the subset of the shards.
39. The system of claim 34, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an access log of the data storage system over a specified period of time; and identifying the data object accessed within the specified period of time.
40. The system of claim 34, wherein the access pattern distribution of the plurality of data objects is generated by: sampling an access log of the data storage system over a specified period of time; and identifying the data object accessed within the specified period of time.
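The log-based generation of an access pattern distribution in claims 39 and 40 can be sketched as follows. The log layout (an iterable of `(timestamp, object_id)` pairs), the `sample_rate` parameter, and the helper names are hypothetical and chosen only for illustration; a full scan corresponds to `sample_rate=1.0`, and a lower value approximates sampling the log.

```python
import random
from collections import Counter


def access_distribution(access_log, window_start, window_end,
                        sample_rate=1.0, seed=0):
    """Count accesses per object inside [window_start, window_end).

    access_log: iterable of (timestamp, object_id) pairs (assumed layout).
    sample_rate: 1.0 scans every entry; smaller values sample the log.
    """
    rng = random.Random(seed)
    counts = Counter()
    for timestamp, object_id in access_log:
        if not (window_start <= timestamp < window_end):
            continue
        if sample_rate >= 1.0 or rng.random() < sample_rate:
            counts[object_id] += 1
    return counts


def cold_objects(all_object_ids, counts, access_threshold):
    """Objects whose access count fails to meet the threshold (count < threshold)."""
    return sorted(oid for oid in all_object_ids if counts[oid] < access_threshold)
```

With sampling, the counts would need to be compared against a threshold scaled by the same rate; that scaling is omitted here for brevity.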
41. The system of claim 34, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an index of a plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the data objects and a most recent access of at least one of the data objects; and identifying the data object accessed within a specified period of time.
42. A method comprising: obtaining, in at least one computing device, a request to retrieve a data object from a data storage system comprising at least one data store; logging, in the at least one computing device, the request in an access log accessible to the at least one computing device; determining, in the at least one computing device, that an access frequency of the data object fails to meet an access frequency threshold; determining, in the at least one computing device, that the data object is stored in a first replication scheme in the data storage system, the first replication scheme comprising a redundant replication scheme wherein a copy of the data object is stored in a plurality of data stores in the data storage system; generating, by the at least one computing device, an object size distribution for the data storage system; periodically determining, by the at least one computing device, a size threshold to maintain a balance between data objects stored in the first replication scheme and a second replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; and encoding, in the at least one computing device, the data object in the second replication scheme responsive to determining that the access frequency fails to meet the access frequency threshold and that a size of the data object meets the size threshold, the second replication scheme comprising an erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the data object and stored in a respective plurality of data stores in the data storage system.
43. The method of claim 42, wherein encoding the data object in the second replication scheme further comprises: dividing the data object into M fragments; generating N fragments from the M fragments; and storing the N fragments and the M fragments in the at least one data store.
44. The method of claim 43, wherein the step of encoding the data object in the second replication scheme further comprises generating an index describing a location of the N fragments and the M fragments corresponding to the data object.
45. The method of claim 43, further comprising reconstructing the data object using an erasure coding algorithm, the data object reconstructed using a first number of the N fragments and the M fragments, wherein the first number is at least equal to M.
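A minimal sketch of the fragment scheme in claims 43 to 45, assuming the simplest possible erasure code: the object is divided into M data fragments and a single XOR parity fragment is generated from them (so N = 1). Any M of the M + 1 stored fragments then suffice to rebuild the object. This is only one illustrative code, not necessarily the one contemplated by the claims; production systems commonly use codes such as Reed-Solomon that tolerate more than one lost fragment. The function names are hypothetical.

```python
def encode(data: bytes, m: int):
    """Split data into m equal-size fragments plus one XOR parity fragment.

    Returns (fragments, original_length); fragments has m + 1 entries.
    """
    if m < 2:
        raise ValueError("m must be at least 2")
    frag_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(frag_len * m, b"\0")  # zero-pad to a multiple of m
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(m)]
    parity = bytearray(frag_len)
    for fragment in fragments:
        for i, byte in enumerate(fragment):
            parity[i] ^= byte
    return fragments + [bytes(parity)], len(data)


def reconstruct(fragments, original_length):
    """Rebuild the object when at most one fragment is missing (None)."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost fragment")
    restored = list(fragments)
    if missing:
        frag_len = len(next(f for f in restored if f is not None))
        rebuilt = bytearray(frag_len)
        # XOR of all surviving fragments equals the missing one.
        for fragment in restored:
            if fragment is not None:
                for i, byte in enumerate(fragment):
                    rebuilt[i] ^= byte
        restored[missing[0]] = bytes(rebuilt)
    # Drop the parity fragment and the padding.
    return b"".join(restored[:-1])[:original_length]
```

Each fragment is smaller than the object, while the fragments together occupy at least as much space as the object, which matches the size relationships recited elsewhere in the claims.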
46. The method of claim 42, wherein encoding the data object in the second replication scheme further comprises generating N fragments from the data object, a total storage size of the N fragments being greater than or equal to a size of the data object.
47. The method of claim 46, wherein encoding the data object in the second replication scheme further comprises storing each one of the N fragments in a different one of the respective plurality of data stores in the data storage system.
48. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, the data storage application configured to cause the at least one computing device to at least: generate an object size distribution of a plurality of data objects stored in a data storage system, the data storage system comprising at least one data store; generate an access pattern distribution of the plurality of data objects; periodically determine a size threshold to maintain a balance between data objects stored in a first replication scheme and a second replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; identify at least one data object stored in the data storage system that fails to meet the size threshold, the at least one data object stored in the first replication scheme, the first replication scheme comprising an erasure coding scheme in which the at least one data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the at least one data object and stored in a respective plurality of data stores in the data storage system; identify that the at least one data object is accessed more often than an access frequency threshold; and store the at least one data object in the second replication scheme in the data storage system responsive to determining that the at least one data object fails to meet the size threshold and the at least one data object is accessed more often than the access frequency threshold over a period of time, the second replication scheme comprising a redundant replication scheme in which a copy of the at least one data object is stored in another respective plurality of data stores in the data storage system.
49. The system of claim 48, wherein the plurality of shards have a total size greater than or equal to the object size of the at least one data object.
50. The system of claim 48, wherein the data storage application is further configured to cause the at least one computing device to at least store at least one of the plurality of shards in the at least one data store.
51. The system of claim 48, wherein the data storage application is further configured to cause the at least one computing device to at least: identify a location of a subset of the shards in an index accessible to the data storage application; retrieve the subset of the shards from the at least one data store; and reconstruct the at least one data object from the subset of the shards.
52. The system of claim 48, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an access log of the data storage system over a specified period of time; and identifying at least one data object accessed within the specified period of time.
53. The system of claim 48, wherein the access pattern distribution of the plurality of data objects is generated by: sampling an access log of the data storage system over a specified period of time; and identifying at least one data object accessed within the specified period of time.
54. The system of claim 48, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an index of a plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the objects and a most recent access of at least one of the objects; and identifying at least one data object accessed within a specified period of time.
55. The system of claim 48, wherein the data storage application is further configured to cause the at least one computing device to at least: receive a request to retrieve a data object from the data storage system; determine whether a size of the data object meets the size threshold; store the data object according to the first replication scheme responsive to determining that the size of the data object meets the size threshold and the data object was previously stored according to the second replication scheme; and store the data object according to the second replication scheme responsive to determining that the size of the data object fails to meet the size threshold and the data object was previously stored according to the first replication scheme.
56. The system of claim 48, wherein the data storage application is further configured to cause the at least one computing device to at least: receive a request to retrieve a data object from the data storage system; determine whether an access frequency of the data object meets the access frequency threshold; store the data object according to the second replication scheme responsive to determining that the access frequency meets the access frequency threshold and the data object was previously stored according to the first replication scheme; and store the data object according to the first replication scheme when the access frequency fails to meet the access frequency threshold and the data object was previously stored according to the second replication scheme.
57. A method comprising: receiving, in at least one computing device, a request to retrieve a data object from a data storage system comprising at least one data store; logging, in the at least one computing device, the request in an access log accessible to the at least one computing device; generating, in the at least one computing device, an object size distribution for the data storage system; periodically determining, in the at least one computing device, a size threshold to maintain a balance between data objects stored in a first replication scheme and a second replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; determining, in the at least one computing device, that the object size of the data object fails to meet the size threshold; determining, in the at least one computing device, that the data object is stored in the first replication scheme in the data storage system, the first replication scheme comprising an erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size less than the object size of the data object and stored in a respective plurality of data stores in the data storage system; and encoding, in the at least one computing device, the data object in the second replication scheme responsive to determining that the size fails to meet the size threshold and the data object is stored according to the first replication scheme, the second replication scheme comprising a redundant replication scheme wherein a copy of the data object is stored in another respective plurality of data stores in the data storage system.
58. The method of claim 57, further comprising: determining, in the at least one computing device, whether an access frequency of the data object meets an access frequency threshold over a period of time; and encoding, in the data storage system, the data object in the second replication scheme responsive to determining that the access frequency meets the access frequency threshold and the data object was previously stored using the first replication scheme.
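The re-encoding decisions recited in claims 55 to 58 can be summarized in a short sketch. It follows the policy stated in the abstract: objects that are both large and infrequently accessed go to erasure coding, and everything else goes to redundant replication. The constant and function names are hypothetical, and "meets the threshold" is again assumed to mean "greater than or equal to".

```python
ERASURE = "erasure_coding"
REPLICATION = "redundant_replication"


def target_scheme(object_size, access_count, size_threshold, access_threshold):
    """Scheme an object should be stored in, given both thresholds."""
    is_large = object_size >= size_threshold
    is_hot = access_count >= access_threshold
    return ERASURE if is_large and not is_hot else REPLICATION


def plan_migration(current_scheme, object_size, access_count,
                   size_threshold, access_threshold):
    """Return the scheme to re-encode into, or None if no change is needed."""
    target = target_scheme(object_size, access_count,
                           size_threshold, access_threshold)
    return None if target == current_scheme else target
```

For example, an erasure-coded object that has become frequently accessed would be planned for migration to redundant replication, while a small replicated object would be left where it is.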
59. A non-transitory computer-readable medium embodying a program executable by at least one computing device, the program, when executed, configured to cause at least one computing device to at least: receive a request to store a data object in a data storage system, the data storage system comprising a plurality of data stores; determine whether a size of the data object meets a size threshold that is periodically determined to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system and depends upon a number of data objects distributed above or below the size threshold in an object size distribution of a plurality of data objects stored in the data storage system; store the data object in the data storage system using the redundant replication scheme in response to the size of the data object not meeting the size threshold, wherein the redundant replication scheme comprises a first scheme in which a copy of the data object is stored in a first at least two of the plurality of data stores; and store the data object using the erasure coding scheme in the data storage system in response to the size of the data object meeting the size threshold, the erasure coding scheme comprising a second scheme in which the data object is divided into a plurality of shards, individual ones of the plurality of shards having a size less than an object size and stored in a second at least two of the plurality of data stores.
60. The non-transitory computer-readable medium of claim 59, wherein the plurality of shards have a total combined size that is greater than or equal to the object size.
61. The non-transitory computer-readable medium of claim 59, wherein the program is further configured to cause the at least one computing device to at least store one of the plurality of shards in a respective one of the second at least two of the plurality of data stores.
62. The non-transitory computer-readable medium of claim 59, wherein the program is further configured to cause the at least one computing device to at least: identify a location of a subset of the plurality of shards in an index accessible to the data storage application; retrieve the subset of the plurality of shards; and reconstruct the data object from the subset of the shards.
63. The non-transitory computer-readable medium of claim 59, wherein the program is further configured to cause the at least one computing device to at least: generate the object size distribution; generate an access pattern distribution of the plurality of data objects; identify another data object stored in the data storage system having another object size that meets the size threshold, the other data object stored using the redundant replication scheme; identify, from the access pattern distribution, whether an access frequency associated with the other data object fails to meet an access frequency threshold; and store the other data object using the erasure coding scheme when the other object size meets the size threshold and the access frequency does not meet the access frequency threshold.
64. The non-transitory computer-readable medium of claim 63, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an access log of the data storage system over a specified period of time; and identifying at least one entry corresponding to the other data object within the access log over the specified period of time.
65. The non-transitory computer-readable medium of claim 64, wherein the access log is scanned by sampling the access log over the specified period of time.
66. The non-transitory computer-readable medium of claim 63, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an index of the plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the plurality of data objects and a most recent access of at least one of the data objects; and identifying at least one entry in the index corresponding to the other data object within a specified period of time.
67. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, the data storage application configured to cause the at least one computing device to at least: receive a request to store a data object in a data storage system, the data storage system comprising a plurality of data stores; determine whether a size of the data object meets a size threshold that is periodically determined to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system and depends upon a number of data objects distributed above or below the size threshold in an object size distribution of a plurality of data objects stored in the data storage system; store the data object in the data storage system using the redundant replication scheme in response to the size of the data object not meeting the size threshold, wherein the redundant replication scheme comprises a first scheme in which a copy of the data object is stored in a first at least two of the plurality of data stores; and store the data object using the erasure coding scheme in the data storage system in response to the size of the data object meeting the size threshold, the erasure coding scheme comprising a second scheme in which the data object is divided into a plurality of shards, individual ones of the plurality of shards having a size less than an object size and stored in a second at least two of the plurality of data stores.
68. The system of claim 67, wherein the plurality of shards have a total combined size that is greater than or equal to the object size.
69. The system of claim 67, wherein the data storage application is further configured to cause the at least one computing device to at least store one of the plurality of shards in a respective one of the second at least two of the plurality of data stores.
70. The system of claim 67, wherein the data storage application is further configured to cause the at least one computing device to at least: identify a location of a subset of the plurality of shards in an index accessible to the data storage application; retrieve the subset of the plurality of shards; and reconstruct the data object from the subset of the shards.
71. The system of claim 67, wherein the data storage application is further configured to cause the at least one computing device to at least: generate the object size distribution; generate an access pattern distribution of the plurality of data objects; identify, from the object size distribution, another data object stored in the data storage system having another object size that meets the size threshold, the other data object stored using the redundant replication scheme; identify, from the access pattern distribution, whether an access frequency associated with the other data object fails to meet an access frequency threshold; and store the other data object using the erasure coding scheme when the other object size meets the size threshold and the access frequency does not meet the access frequency threshold.
72. The system of claim 71, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an access log of the data storage system over a specified period of time; and identifying at least one entry corresponding to the other data object within the access log over the specified period of time.
73. The system of claim 72, wherein the access log is scanned by sampling the access log over the specified period of time.
74. The system of claim 71, wherein the access pattern distribution of the plurality of data objects is generated by: scanning an index of the plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the plurality of data objects and a most recent access of at least one of the data objects; and identifying at least one entry in the index corresponding to the other data object within a specified period of time.
75. The system of claim 67, wherein the data storage application is further configured to cause the at least one computing device to at least: receive a request to retrieve another data object from the data storage system; determine whether another size of the other data object meets the size threshold; store the other data object using the erasure coding scheme when the other size meets the size threshold; and store the data object using the redundant replication scheme when the other size does not meet the size threshold.
76. The system of claim 67, wherein the data storage application is further configured to cause the at least one computing device to at least: receive a request to retrieve another data object from the data storage system; determine whether an access frequency associated with the other data object meets an access frequency threshold; store the other data object using the erasure coding scheme when the access frequency does not meet the access frequency threshold; and store the other data object according to the redundant replication scheme when the access frequency meets the access frequency threshold.
77. A method, comprising: obtaining, by at least one computing device, a request to store a data object in a data storage system comprising a plurality of data stores; determining, by the at least one computing device, that a size of the data object meets a size threshold that is periodically determined to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system and depends upon a number of data objects distributed above or below the size threshold in an object size distribution of a plurality of data objects stored in the data storage system; storing, by the at least one computing device, the data object using the erasure coding scheme in the data storage system instead of the redundant replication scheme in response to the size of the data object meeting the size threshold, the erasure coding scheme comprising a scheme in which the data object is divided into a plurality of shards, individual ones of the plurality of shards having a size less than an object size and stored in at least two of the plurality of data stores; and storing, by the at least one computing device, another data object in the data storage system using the redundant replication scheme, wherein a copy of the other data object is stored in another at least two of the plurality of data stores in the data storage system.
78. The method of claim 77, further comprising: dividing the data object into M fragments; generating N fragments from the M fragments; and storing the N fragments and the M fragments in the at least two of the plurality of data stores.
79. The method of claim 78, further comprising reconstructing the data object using a first number of fragments selected from the N fragments and the M fragments, wherein the first number is less than N+M.
80. The method of claim 77, further comprising generating N fragments from the data object, wherein a total size of the N fragments is at least as great as the object size.
81. The method of claim 80, wherein storing the data object in the erasure coding scheme further comprises storing individual ones of the N fragments in different ones of the at least two of the plurality of data stores in the data storage system.
82. The method of claim 81, further comprising reconstructing the data object from a subset of the N fragments.
83. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, the data storage application configured to cause the at least one computing device to at least: perform an analysis of a data storage system to generate an object size distribution for the data storage system, the data storage system comprising at least one data store; periodically determine a size threshold to maintain a balance between data objects stored in a first data replication scheme and a second data replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; identify whether a data object is stored in the data storage system in at least one of the first data replication scheme or the second data replication scheme, the first data replication scheme comprising a redundant replication scheme wherein a copy of the data object is stored in a plurality of data stores in the data storage system and the second data replication scheme comprising an erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the data object and stored in a respective plurality of data stores in the data storage system; determine whether an object size of the data object meets the size threshold; and store the data object in one of the first data replication scheme or the second data replication scheme in the data storage system based at least in part upon whether the data object meets the size threshold.
84. The system of claim 83, wherein the data storage application is further configured to cause the at least one computing device to store the data object in the one of the first data replication scheme or the second data replication scheme further based at least in part upon an access frequency of the data object.
85. The system of claim 83, wherein the data object is stored in the second data replication scheme by: dividing the data object into M fragments; generating N fragments from the M fragments; and storing the N fragments and the M fragments in the respective plurality of data stores.
86. The system of claim 85, wherein the data storage application is further configured to reconstruct the data object using a first number of fragments selected from the N fragments and the M fragments, wherein the first number is less than N+M.
87. The system of claim 83, wherein the data object is stored in the second data replication scheme by generating N fragments from the data object, wherein a total size of the N fragments is at least as great as the object size.
88. The system of claim 87, wherein the data object is stored in the second data replication scheme by storing individual ones of the N fragments in different ones of at least two of the plurality of data stores in the data storage system.
89. The system of claim 88, wherein the data storage application is further configured to reconstruct the data object from a subset of the N fragments.
90. A method, comprising: performing, by at least one computing device, an analysis of a data storage system to generate an object size distribution for the data storage system, the data storage system comprising at least one data store; periodically determining, by the at least one computing device, a size threshold to maintain a balance between data objects stored in a first data replication scheme and a second data replication scheme in the data storage system based at least in part on a number of data objects distributed above or below the size threshold in the object size distribution; identifying, by the at least one computing device, whether a data object is stored in the data storage system in at least one of the first data replication scheme or the second data replication scheme, the first data replication scheme comprising a redundant replication scheme wherein a copy of the data object is stored in a plurality of data stores in the data storage system and the second data replication scheme comprising an erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size less than an object size of the data object and stored in a respective plurality of data stores in the data storage system; determining, by the at least one computing device, whether an object size of the data object meets the size threshold; and storing, by the at least one computing device, the data object in one of the first data replication scheme or the second data replication scheme in the data storage system based at least in part upon whether the data object meets the size threshold.
91. The method of claim 90, wherein the at least one computing device stores the data object in the one of the first data replication scheme or the second data replication scheme further based at least in part upon an access frequency of the data object.
92. The method of claim 90, wherein the data object is stored in the second data replication scheme by: dividing the data object into M fragments; generating N fragments from the M fragments; and storing the N fragments and the M fragments in at least two data stores.
 93. The method of claim 92, further comprising reconstructing the data object using a first number of fragments selected from the N fragments and the M fragments, wherein the first number is less than N+M.
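The fragment generation and reconstruction of claims 92 and 93 can be illustrated with the simplest possible erasure code: M data fragments plus a single XOR parity fragment (N = 1). This is a minimal sketch under that assumption, not the coding scheme of the disclosure; a production system would more likely use a code such as Reed-Solomon that tolerates more than one lost fragment. The names `encode` and `reconstruct` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, m: int):
    """Divide data into m equal-size fragments (zero padded) and
    generate one parity fragment as the XOR of all m fragments."""
    frag_len = -(-len(data) // m)  # ceiling division
    padded = data.ljust(frag_len * m, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(m)]
    parity = reduce(xor_bytes, fragments)
    return fragments, parity


def reconstruct(fragments, parity, original_len):
    """Rebuild the object when at most one data fragment is missing (None)."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover at most one fragment")
    fragments = list(fragments)
    if missing:
        present = [frag for frag in fragments if frag is not None]
        # XOR of the parity with all surviving fragments yields the lost one.
        fragments[missing[0]] = reduce(xor_bytes, present, parity)
    return b"".join(fragments)[:original_len]


data = b"erasure coding example"
fragments, parity = encode(data, m=4)   # M = 4 data fragments, N = 1 parity
damaged = list(fragments)
damaged[2] = None                       # simulate an unavailable data store
recovered = reconstruct(damaged, parity, len(data))
```

Here the object is rebuilt from four of the five stored fragments, which is fewer than N+M, at a storage overhead of 1.25x rather than the 2x or more of full replication.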
 94. The method of claim 90, wherein the data object is stored in the second data replication scheme by generating N fragments from the data object, wherein a total size of the N fragments is at least as great as the object size.
 95. The method of claim 94, wherein the data object is stored in the second data replication scheme by storing individual ones of the N fragments in different ones of at least two plurality of data stores in the data storage system.
 96. The method of claim 95, further comprising reconstructing the data object from a subset of the N fragments.
 97. A system, comprising: at least one computing device; and at least one storage device that is accessible to the at least one computing device, the at least one storage device storing a data storage application executable in the at least one computing device, the data storage application configured to cause the at least one computing device to at least: generate an object size distribution of a plurality of data objects stored in a data storage system, the data storage system comprising at least one data store; periodically determine a size threshold to maintain a balance between data objects stored in a redundant replication scheme and an erasure coding scheme in the data storage system based at least in part upon a number of data objects distributed above or below the size threshold in the object size distribution; identify a data object stored in the data storage system having an object size that meets the size threshold, the data object stored in the data storage system using the redundant replication scheme, wherein a copy of the data object is stored in a plurality of data stores in the data storage system; and store the data object in the data storage system in the erasure coding scheme, wherein the data object is divided into a plurality of shards, each of the plurality of shards having a size of less than an object size of the data object, each of the plurality of shards being stored in a respective plurality of data stores in the data storage system.
 98. The system of claim 97, wherein the data storage application is further configured to cause the at least one computing device to at least: identify another data object from the object size distribution that is stored in the data storage system having another object size that fails to meet the object size threshold, the other data object stored in the data storage system using the erasure coding scheme; and store the other data object in the data storage system in the redundant replication scheme.
 99. The system of claim 97, wherein the plurality of shards have a total size greater than or equal to the object size of the data object.
 100. The system of claim 97, wherein the data storage application is further configured to cause the at least one computing device to at least store one of the plurality of shards in at least a subset of the at least one data store.
 101. The system of claim 97, wherein the data storage application is further configured to cause the at least one computing device to at least: identify a location of a subset of the shards in an index accessible to the data storage application; retrieve the subset of the shards from the at least one data store; and reconstruct the data object from the subset of the shards.
 102. The system of claim 97, wherein the data storage application is further configured to cause the at least one computing device to at least generate an access pattern distribution of the plurality of data objects by: sampling an access log of the data storage system over a specified period of time; and identifying the data object accessed within the specified period of time.
 103. The system of claim 97, wherein the data storage application is further configured to cause the at least one computing device to at least generate an access pattern distribution of the plurality of data objects by: scanning an index of a plurality of data objects stored in the data storage system, the index specifying a storage location in the data storage system of the data objects and a most recent access of at least one of the data objects; and identifying the data object accessed within a specified period of time.
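For illustration, the access pattern distribution of claims 102 and 103 could be derived from an access log roughly as shown below. The log format (a list of timestamp and object identifier pairs), the seven-day window, and the function name are assumptions of this sketch rather than details taken from the disclosure.

```python
from collections import Counter
from datetime import datetime, timedelta


def access_pattern_distribution(access_log, window_start, window_end):
    """Count accesses per object within [window_start, window_end)."""
    return Counter(
        object_id
        for timestamp, object_id in access_log
        if window_start <= timestamp < window_end
    )


# Hypothetical access log entries: (time of access, object identifier).
now = datetime(2024, 1, 8)
log = [
    (datetime(2024, 1, 1, 9), "obj-a"),
    (datetime(2024, 1, 5, 12), "obj-a"),
    (datetime(2024, 1, 6, 3), "obj-b"),
    (datetime(2023, 12, 1), "obj-c"),  # outside the sampled period
]
distribution = access_pattern_distribution(log, now - timedelta(days=7), now)
recently_accessed = set(distribution)
```

Objects absent from `recently_accessed` were not accessed in the specified period and would be the natural candidates for the erasure coding scheme.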
 104. The system of claim 97, wherein the data storage application is further configured to cause the at least one computing device to at least: determine whether an access frequency of the data object meets an access frequency threshold; store the data object according to the erasure coding scheme when the object size meets the size threshold and the access frequency fails to meet the access frequency threshold; and store the data object according to the redundant replication scheme when the access frequency meets the access frequency threshold.
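The combined size and access frequency test of claim 104 reduces to a small decision function. The sketch below assumes that "meets" a threshold means "greater than or equal to", and it places small, infrequently accessed objects in the redundant replication scheme, following the abstract; the function and parameter names are hypothetical.

```python
def select_scheme(object_size, access_frequency,
                  size_threshold, access_frequency_threshold):
    """Choose a storage scheme from object size and access frequency."""
    # Frequently accessed objects stay replicated regardless of size.
    if access_frequency >= access_frequency_threshold:
        return "redundant_replication"
    # Large, infrequently accessed objects are erasure coded.
    if object_size >= size_threshold:
        return "erasure_coding"
    # Small, infrequently accessed objects remain replicated (per the abstract).
    return "redundant_replication"


# Hypothetical thresholds: 1 MiB object size and 10 accesses per period.
SIZE_THRESHOLD = 1 << 20
ACCESS_THRESHOLD = 10

large_cold = select_scheme(5 << 20, 2, SIZE_THRESHOLD, ACCESS_THRESHOLD)
large_hot = select_scheme(5 << 20, 50, SIZE_THRESHOLD, ACCESS_THRESHOLD)
small_cold = select_scheme(4096, 2, SIZE_THRESHOLD, ACCESS_THRESHOLD)
```

Checking access frequency first reflects the claim's ordering: a hot object is replicated even when its size alone would qualify it for erasure coding.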